Scikit Learn Machine Learning Tutorial p7

by: xtianvillaruz, 6 years ago

Last edited: 6 years ago

Hello Harrison and to everyone who wants to help me,

I got this message after I ran the Scikit Learn Machine Learning Tutorial p7 code in my Jupyter notebook:

FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
result = op(self.values, np.asarray(other))

and the TotalDebtEquitymrq.csv is empty.

import pandas as pd
import os
import time
from datetime import datetime

path = 'intraQuarter'


def Key_Stats(gather='Total Debt/Equity (mrq)'):
    statspath = path + '/_KeyStats'
    stock_list = [x[0] for x in os.walk(statspath)]
    df = pd.DataFrame(columns =['Date','Unix','Ticker','DE Ratio'])
    
    sp500_df = pd.read_csv('YAHOO_SP500.csv')
    
    for each_dir in stock_list[1:25]:
        each_file = os.listdir(each_dir)
        ticker = os.path.basename(each_dir)
        if len(each_file) > 0:
            for file in each_file:
                date_stamp = datetime.strptime(file, '%Y%m%d%H%M%S.html')
                unix_time = time.mktime(date_stamp.timetuple())
                full_file_path = each_dir + '/' + file
                source = open(full_file_path, 'r').read()
                try:
                    value = float(source.split(gather+':</td><td class="yfnc_tabledata1">')[1].split('</td>')[0])
                    
                    try:
                        sp500_date = datetime.fromtimestamp(unix_time).strftime('%d/ %m/ %Y')
                        row = sp500_df[(sp500_df.index == sp500_date)]
                        sp500_value = float(row['Adj Close'])
                    except:
                        sp500_date = datetime.fromtimestamp(unix_time-259200).strftime('%d/ %m/ %Y')
                        row = sp500_df[(sp500_df.index == sp500_date)]
                        sp500_value = float(row['Adj Close'])
                    
                    stock_price = float(source.split('</small><big><b>')[1].split('</b></big>')[0])
                    
                    df = df.append({'Date':date_stamp,'Unix':unix_time,'Ticker':ticker,'DE Ratio':value,'Price':stock_price, 'SP500': sp500_value}, ignore_index = True)
                
                except Exception as e:
                    pass
        
        save = gather.replace(' ','').replace(')','').replace('(','').replace('/','')+('.csv')
        print(save)
        df.to_csv(save)
                
        time.sleep(15)
    
Key_Stats()



What does it mean?


Since the Yahoo Finance S&P 500 index data is no longer available on Quandl, I downloaded a new file from Yahoo Finance. It uses the date format dd/mm/YYYY, whereas the tutorial file uses YYYY-mm-dd. I think this might be the culprit. I searched, read the date formatting documentation, and modified the code, but I was still unsuccessful.

Hope you can help me.

Thanks..


xtian






Sorry guys...

I forgot my pandas lesson.
I already resolved the issue. The missing line is:

sp500_df = pd.read_csv('YAHOO_SP500.csv', parse_dates=True, index_col = 0)


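The error was caused by the index value, and that is why TotalDebtEquitymrq.csv was empty. Without index_col = 0 the DataFrame gets a default integer index, so comparing sp500_df.index to a date string never matches a row. Also, the tutorial video was made in 2014 and uses an older version of pandas.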
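For anyone hitting the same thing, here is a minimal sketch of the difference the two arguments make. The two-row CSV text is a made-up stand-in for YAHOO_SP500.csv (the numbers are not real data), and it assumes ISO-style dates so that it runs the same way across pandas versions.

```python
import io
import pandas as pd

# Hypothetical sample standing in for YAHOO_SP500.csv; values are made up.
csv_text = """Date,Open,Adj Close
2014-01-02,1845.86,1831.98
2014-01-03,1833.21,1831.37
"""

# Default read: 'Date' is an ordinary column and the index is just 0, 1, ...
plain = pd.read_csv(io.StringIO(csv_text))

# Read as in the fix: the first column becomes a parsed date index.
dated = pd.read_csv(io.StringIO(csv_text), parse_dates=True, index_col=0)

print(type(plain.index).__name__)
print(type(dated.index).__name__)

# Look up one day by comparing the date index against a Timestamp.
lookup = pd.Timestamp('2014-01-02')
row = dated[dated.index == lookup]
sp500_value = float(row['Adj Close'].iloc[0])
print(sp500_value)
```

If the downloaded file really stores dates as dd/mm/YYYY, passing dayfirst=True to pd.read_csv may also be needed so that day and month are not swapped. That part is an assumption and is not tested in this sketch.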

But still pythonprogramming.net is very helpful and free....

thanks Harrison.

-xtianvillaruz 6 years ago
